WEB PARSER
 

WebParser reads information from webpages. The plugin uses regular expressions to parse the web page which allows it to extract information pretty much from any page. The plugin can be used e.g. to get the current TV shows, weather conditions, stock exchange values, news and basically anything that is on the net. The negative side is that the regular expressions might look rather complex especially if you're not familiar with programming languages (and even if you are :-).

Url
Url to the file to be downloaded and parsed. The Url can also be another WebParser-measure, in which case the already downloaded and parsed information can be reused (e.g. when displaying different StringIndex on the same page). To do this just give the name of the measure in the Url, like this: Url=[MeasureSlashDot]

RegExp
The regular expression used in parsing. The plugin uses Perl Compatible Regular Expressions, so check the Perl docs for syntax and more info.

FinishAction
Action that is executed when the page has been downloaded and the parsing is done.

StringIndex
Defines which string from the regexp this measure returns. You can get the correct index values by setting the Debug=1, which will add all matched strings to the log-file.

StringIndex2
The second string index is used when using a RegExp in a measure that uses data from another webparser measure (i.e. the Url points to a measure and not to a real URL). In this case the StringIndex defines the index of the result of the other RegExp and the StringIndex2 defines the index of this measure's RegExp (i.e. it defines the string that the measure returns). If the RegExp is not defined in this measure the StringIndex2 has no effect.

UpdateRate
The rate how often the webpage is downloaded. This is relative to the config's main update rate. It is advisable to limit the rate so that you're not flooding the server with constant requests. The web server admins will not like it and you probably get banned from the server altogether if you try to poll the server too often. So, if the main update rate is 1000 (i.e. one second, which is the default) set this e.g. to 60 to read the webpage once per minute.

Debug
Set this to 1 and the log file will contain some useful debug information. Value 2 dumps the downloaded webpage to C:\WebParserDump.txt. This can be useful since some web servers send different information depending which client requests it. Remember to remove this from your config once you have it working correctly.

Download
If set to 1, the Url is downloaded to a temporary folder and the name to the file is returned as string value. The measure can be bound to a IMAGE meter to download images from web and show them. If the RegExp is defined the parsed string is interpreted as a link to the downloaded image.

ErrorString
String that is returned in case there is a parse error.

Proxy
Name of the proxy server. The plugin doesn't support any authentication so it's possible to use only servers that does not require it or you need a some different way to authenticate yourself to the proxy server.

CodePage
Defines the codepage of the downloaded web page. For example CodePage=28605 interprets the page as Latin 9 (ISO-8859-15). If the CodePage is set to 0 no conversion is done. CodePage=65001 means UTF-8. You can check other Windows code pages from here.


Examples

Display the title and first item from Slashdot's RSS feed.

[MeasureRSSTitle]
Measure=Plugin
Plugin=Plugins\WebParser.dll
UpdateDivider=3600
Url=http://slashdot.org/slashdot.rdf
RegExp="(?siU)<title>(.+)</title>(.+)<item(.+)<title>(.+)</title>"
StringIndex=1
FinishAction=!RainmeterRedraw

[MeasureRSS]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[MeasureRSSTitle]
StringIndex=4
Substitute="&amp;":"&"


Download the current weather map from Finnish Meteorological Institute.

[MeasureDL]
Measure=Plugin
Plugin=Plugins\WebParser.dll
UpdateRate=1800
Url=http://www.fmi.fi/saa/sadejapi_5.html?selected=7
RegExp="(?siU)ennuste" src=\"(.*)\"></td>"
FinishAction=!RainmeterRedraw
StringIndex=1
Download=1

[MeterImage]
Meter=IMAGE
MeasureName=MeasureDL
X=0
Y=0
W=400
H=300